Background and Context
Agriculture urgently needs modernization: checking whether plants are growing correctly still demands extensive manual work. Despite several advances in agricultural technology, workers in the industry must still sort and recognize different plants and weeds by hand, which takes considerable time and effort in the long term.
This trillion-dollar industry is ripe for technological innovations that cut down on manual labor, and this is where Artificial Intelligence can benefit the field: AI and Deep Learning can greatly shorten the time and energy required to identify plant seedlings. Doing so more efficiently, and even more effectively, than experienced manual labor could lead to better crop yields, free up human involvement for higher-order agricultural decision making, and, in the long term, result in more sustainable environmental practices in agriculture.
The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has provided a dataset containing images of unique plants belonging to 12 different species. As a data scientist, you need to build a Convolutional Neural Network (CNN) model that classifies the plant seedlings into their respective 12 categories.
The data file names are:
images.npy
Labels.csv
Due to the large volume of data, the images were converted to NumPy arrays and stored in images.npy, and the corresponding labels were put into Labels.csv, so that you can work on the data/project seamlessly without having to worry about the high data volume.
The goal of the project is to create a classifier capable of determining a plant's species from an image.
List of all the 12 Plant species in the dataset
Black-grass
Charlock
Cleavers
Common Chickweed
Common Wheat
Fat Hen
Loose Silky-bent
Maize
Scentless Mayweed
Shepherds Purse
Small-flowered Cranesbill
Sugar beet
# Connect to Google Drive to read and write files
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
#Libraries to help with reading and manipulating data
import numpy as np
import pandas as pd
import cv2
#Libraries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns
# Command to tell the notebook to display the graphs inline
%matplotlib inline
#Importing ImageGrid to plot the images
from mpl_toolkits.axes_grid1 import ImageGrid
# Display images using OpenCV
from google.colab.patches import cv2_imshow # Importing cv2_imshow from google.colab.patches to display images in Colab
#For one hot encoding the labels
from sklearn.preprocessing import LabelBinarizer
# Tensorflow modules
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential, Model # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization # Importing all the layers used to build our CNN model
from tensorflow.keras.optimizers import Adam # Importing the optimizers which can be used in our model
from sklearn.model_selection import train_test_split # Importing train_test_split function to split the data into train and test
from sklearn.metrics import classification_report, confusion_matrix # Importing classification_report and confusion_matrix to evaluate the model
# Clearing backend
from tensorflow.keras import backend
from tensorflow.keras.applications.vgg16 import VGG16
# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
Loading the dataset
# Load the image file of the dataset
images = np.load('/content/drive/MyDrive/Colab Notebooks/images.npy')
# Load the labels file of the dataset
labels = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Labels.csv')
Overview of the dataset
print(images.shape)
print(labels.shape)
(4750, 128, 128, 3)
(4750, 1)
There are 4,750 RGB images, each of size 128 x 128 with 3 color channels.
Let us inspect the labels
labels.sample(5)
| | Label |
|---|---|
| 1153 | Shepherds Purse |
| 1364 | Common wheat |
| 3166 | Scentless Mayweed |
| 4637 | Loose Silky-bent |
| 1214 | Common wheat |
num_categories = labels['Label'].nunique()
num_categories
12
We have 4,750 labels with 12 unique values.
# Let us list all the 12 unique plant species
categories = labels.Label.unique()
categories
array(['Small-flowered Cranesbill', 'Fat Hen', 'Shepherds Purse',
'Common wheat', 'Common Chickweed', 'Charlock', 'Cleavers',
'Scentless Mayweed', 'Sugar beet', 'Maize', 'Black-grass',
'Loose Silky-bent'], dtype=object)
labels.value_counts()
Label
Loose Silky-bent             654
Common Chickweed             611
Scentless Mayweed            516
Small-flowered Cranesbill    496
Fat Hen                      475
Charlock                     390
Sugar beet                   385
Cleavers                     287
Black-grass                  263
Shepherds Purse              231
Common wheat                 221
Maize                        221
dtype: int64
Out of the 4,750 image samples:
With 654 samples, Loose Silky-bent has more samples than any of the other 11 species.
Common wheat and Maize have 221 samples each - the fewest in the dataset.
This indicates an imbalance in the dataset.
cv2_imshow(images[5])
plt.imshow(images[5])
<matplotlib.image.AxesImage at 0x7f5c1d7e1990>
We can observe that the image appears in different colors when plotted with OpenCV and with matplotlib. This is because OpenCV reads images in BGR format, and it shows that the given NumPy arrays were generated from the original images using OpenCV.
Now we will convert these BGR images to RGB so we can interpret them easily.
# Converting the images from BGR to RGB using cvtColor function of OpenCV
for i in range(len(images)):
images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
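Since BGR to RGB is just a reversal of the channel axis, the same conversion can also be done in one vectorized step. A minimal sketch on a dummy array (images has already been converted in place above):
# Sketch: reversing the last axis is equivalent to cv2's BGR -> RGB conversion
dummy_bgr = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
assert np.array_equal(dummy_bgr[..., ::-1], cv2.cvtColor(dummy_bgr, cv2.COLOR_BGR2RGB))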
Function to generate labeled Count Plots of the 12 unique types of seedlings available in the dataset
# Function to create countplots with counts or percentages labeled on the bars
def labeled_countplot(data, feature, perc=False, n=None):
"""
Countplot with the count or percentage labeled on top of each bar
data: dataframe
feature: dataframe column
perc: whether to display percentages instead of counts (default is False)
n: displays the top n category levels (default is None, i.e., display all levels)
"""
total = len(data[feature]) # length of the column
count = data[feature].nunique()
if n is None:
plt.figure(figsize=(count + 1, 3))
else:
plt.figure(figsize=(n + 1, 3))
plt.xticks(rotation=90, fontsize=15)
ax = sns.countplot(
data=data,
x=feature,
palette="Paired",
order=data[feature].value_counts().index[:n].sort_values(),
)
for p in ax.patches:
if perc == True:
label = "{:.1f}%".format(
100 * p.get_height() / total
) # percentage of each class of the category
else:
label = p.get_height() # count of each level of the category
x = p.get_x() + p.get_width() / 2  # x-coordinate of the bar's center
y = p.get_height()  # height of the bar
ax.annotate(
label,
(x, y),
ha="center",
va="center",
size=12,
xytext=(0, 5),
textcoords="offset points",
) # annotate the count/percentage above the bar
plt.show() # show the plot
labeled_countplot(labels,"Label", perc=True)
Observation:
"Loose Silky bent" and Common Chickweed samples are more compared to other categories - they make up 13.8% and 12.9% of the data.
We only have 4.7% of "Common Wheat" and "Maize" samples in the dataset.
There is an imbalance in the dataset.
Displaying the images of all the 12 plant categories
#Function to display all the 12 types of seedlings available in the dataset
def plot_images(images, labels):
# Indices chosen so as to cover all 12 species
index_list = [1, 500, 1423, 4745, 3612, 2711,971, 3227, 3833, 2034, 2424, 1202, 4100, 27, 143]
keys = dict(labels['Label'])
rows = 3
cols = 4
fig = plt.figure(figsize=(10, 8))
for i, index in enumerate(index_list):
if i >= rows * cols: # Break the loop if all desired images are plotted
break
ax = fig.add_subplot(rows, cols, i + 1)
ax.imshow(images[index])
ax.set_title(keys[index])
plt.show()
plot_images(images, labels)
As the images are fairly large, it may be computationally expensive to train on them; therefore, it is preferable to reduce the image size from 128 x 128 to 64 x 64.
images_decreased=[]
height = 64
width = 64
dimensions = (width, height)
for i in range(len(images)):
images_decreased.append( cv2.resize(images[i], dimensions, interpolation=cv2.INTER_LINEAR))
plt.imshow(images_decreased[3])
<matplotlib.image.AxesImage at 0x7f5c15434fd0>
# Applying Gaussian Blur to de-noise the images
images_gb=[]
for i in range(len(images_decreased)):
images_gb.append(cv2.GaussianBlur(images_decreased[i], ksize =(3,3),sigmaX = 0))
plt.imshow(images_gb[3])
<matplotlib.image.AxesImage at 0x7f5b34173250>
It appears that applying Gaussian blur would be counterproductive: the blurred/denoised image no longer seems to contain the relevant detail, and the model would struggle to categorize these blurred images.
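For reference, with ksize=(3, 3) and sigmaX=0, OpenCV derives the sigma from the kernel size, so the effective smoothing kernel can be inspected directly (a small sketch):
# With sigma <= 0, OpenCV computes sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8, i.e. 0.8 for ksize=3
k = cv2.getGaussianKernel(3, 0)  # 1D kernel, approximately [0.25, 0.5, 0.25]
print(np.outer(k, k))            # the effective 3x3 Gaussian kernel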
Displaying images of the 12 categories after preprocessing
plot_images(images_decreased, labels)
Plotting the resized images of the 12 plant categories in a 12x12 grid
# Defining a figure for the 12x12 image grid
fig = plt.figure(1, figsize=(num_categories,num_categories))
grid = ImageGrid(fig, 111, nrows_ncols=(num_categories,num_categories), axes_pad=0.05)
i = 0
index = labels.index
# Plotting resized images of the 12 seedling categories
for category_id, category in enumerate(categories):
condition = labels["Label"] == category
plant_indices = index[condition].tolist()
for j in range(0,12):
ax = grid[i]
#Using the resized images
ax.imshow(images_decreased[plant_indices[j]])
ax.axis('off')
if i % num_categories == num_categories - 1:
# printing the name of each category
ax.text(200, 70, category, verticalalignment='center')
i += 1
plt.show();
Resizing the images did not seem to have impacted the quality of the images.
We use the train_test_split() function from scikit-learn to split the dataset into three parts: train, validation, and test.
20% of the data is set aside as test/unseen data. The remainder is split into training (90%) and validation (10%) datasets.
X_temp, X_test, y_temp, y_test = train_test_split(np.array(images_decreased), labels, test_size=0.2, random_state=1, stratify=labels)
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.1, random_state=1, stratify=y_temp)
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
(3420, 64, 64, 3) (3420, 1)
(380, 64, 64, 3) (380, 1)
(950, 64, 64, 3) (950, 1)
We use 3,420 images to train the model, 380 images for validation, and 950 images to test the model.
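A quick arithmetic check of the expected split sizes (a small sketch):
# 20% of 4,750 goes to test; the remaining 3,800 are split 90/10
total = 4750
test = int(total * 0.2)            # 950
train = int((total - test) * 0.9)  # 3420
val = (total - test) - train       # 380
print(train, val, test)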
y_train.value_counts()
Label
Loose Silky-bent             471
Common Chickweed             440
Scentless Mayweed            372
Small-flowered Cranesbill    357
Fat Hen                      342
Charlock                     281
Sugar beet                   277
Cleavers                     206
Black-grass                  189
Shepherds Purse              167
Common wheat                 159
Maize                        159
dtype: int64
y_test.value_counts()
Label
Loose Silky-bent             131
Common Chickweed             122
Scentless Mayweed            103
Small-flowered Cranesbill     99
Fat Hen                       95
Charlock                      78
Sugar beet                    77
Cleavers                      58
Black-grass                   53
Shepherds Purse               46
Common wheat                  44
Maize                         44
dtype: int64
Treating class imbalance by increasing the weights of the minority classes.
from sklearn.utils import class_weight
# Using the sorted class list so that the integer keys below line up with
# the alphabetical one-hot column order produced by LabelBinarizer below
labelList = np.unique(labels.Label)
class_weights = class_weight.compute_class_weight(class_weight = "balanced",
                                                  classes = labelList,
                                                  y = y_train.values.reshape(-1)
                                                  )
class_weights = dict(zip(range(len(labelList)), class_weights))
# Print the calculated class weights
class_weights
{0: 1.507936507936508,
 1: 1.0142348754448398,
 2: 1.383495145631068,
 3: 0.6477272727272727,
 4: 1.7924528301886793,
 5: 0.8333333333333334,
 6: 0.6050955414012739,
 7: 1.7924528301886793,
 8: 0.7661290322580645,
 9: 1.7065868263473054,
 10: 0.7983193277310925,
 11: 1.0288808664259927}
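As a quick check (a sketch of sklearn's "balanced" formula, weight = n_samples / (n_classes * class_count)), we can reproduce one of these weights by hand:
# Reproducing the Black-grass weight (189 training samples) by hand
print(len(y_train) / (12 * 189))   # ~1.5079, matching key 0 above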
# Convert labels from names to one-hot vectors.
# LabelBinarizer works similarly to OneHotEncoder
encoder = LabelBinarizer()
y_train_encoded = encoder.fit_transform(y_train)
y_val_encoded=encoder.transform(y_val)
y_test_encoded=encoder.transform(y_test)
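LabelBinarizer orders its one-hot columns alphabetically. A quick check of the column-to-species mapping (a small sketch, handy when reading the class indices in the confusion matrices later):
# Column k of the one-hot vectors corresponds to encoder.classes_[k];
# alphabetically, column 0 = 'Black-grass' and column 6 = 'Loose Silky-bent'
print(encoder.classes_)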
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
(3420, 64, 64, 3) (3420, 1)
(380, 64, 64, 3) (380, 1)
(950, 64, 64, 3) (950, 1)
Visualizing the one-hot encoded labels in y_train, y_test, and y_val
print(" y_train_encoded[0]:", y_train_encoded[0])
print(" y_test_encoded[10]:", y_test_encoded[10])
print(" y_val_encoded[100]:", y_val_encoded[100])
y_train_encoded[0]: [0 0 0 0 0 0 1 0 0 0 0 0]
y_test_encoded[10]: [0 1 0 0 0 0 0 0 0 0 0 0]
y_val_encoded[100]: [0 0 0 0 0 0 0 1 0 0 0 0]
Since the image pixel values range from 0-255, we normalize by scaling: dividing all pixel values by 255 so that every image has values between 0 and 1.
# Normalizing the image pixels
X_train_normalized = X_train.astype('float32')/255.0
X_val_normalized = X_val.astype('float32')/255.0
X_test_normalized = X_test.astype('float32')/255.0
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
(3420, 64, 64, 3) (3420, 1)
(380, 64, 64, 3) (380, 1)
(950, 64, 64, 3) (950, 1)
The data is already in a shape compatible with Keras models and does not need to be reshaped.
Let's create a CNN model sequentially, where we will be adding the layers one after another.
First, we need to clear the previous model's state from the session; otherwise, the history of earlier models remains in the backend, even when a single model is run multiple times on the same data.
In Keras, we need a special command, backend.clear_session(), to clear this state.
After clearing the backend, let's fix the seeds again: we set the seed for the random number generators in NumPy, Python's random library, and TensorFlow, so that we can reproduce the same results every time we run the code.
# Clearing backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
Now, let's build a CNN Model with the following 2 main parts -
The Feature Extraction layers which are comprised of convolutional and pooling layers.
The Fully Connected classification layers for prediction.
Building Model1
# Intializing a sequential model
model1 = Sequential()
# Adding first conv layer with 64 filters and kernel size 3x3; padding 'same' keeps the output size the same as the input size
# input_shape denotes the dimensions of the input images
model1.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
# Adding max pooling to reduce the size of output of first conv layer
model1.add(MaxPooling2D((2, 2), padding = 'same'))
model1.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model1.add(MaxPooling2D((2, 2), padding = 'same'))
# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model1.add(Flatten())
# Adding a fully connected dense layer with 100 neurons
model1.add(Dense(100, activation='relu'))
# Adding the output layer with 12 neurons (12 distinct labels) and activation function as softmax since this is a multi-class classification problem
model1.add(Dense(12, activation='softmax'))
optimizer=Adam()
# Compile model
model1.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
# Generating the summary of the model
model1.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 64) 1792
max_pooling2d (MaxPooling2D) (None, 32, 32, 64) 0
conv2d_1 (Conv2D) (None, 32, 32, 32) 18464
max_pooling2d_1 (MaxPooling2D) (None, 16, 16, 32) 0
flatten (Flatten) (None, 8192) 0
dense (Dense) (None, 100) 819300
dense_1 (Dense) (None, 12) 1212
=================================================================
Total params: 840,768
Trainable params: 840,768
Non-trainable params: 0
_________________________________________________________________
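As a sanity check on the summary, the parameter counts follow directly from the layer shapes (a small sketch):
# Conv2D params = (kernel_h * kernel_w * in_channels + 1 bias) * filters
print((3 * 3 * 3 + 1) * 64)   # conv2d: 1,792
print((3 * 3 * 64 + 1) * 32)  # conv2d_1: 18,464
# Dense params = (inputs + 1 bias) * units
print((8192 + 1) * 100)       # dense: 819,300
print((100 + 1) * 12)         # dense_1: 1,212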
Fitting model1 on training data
history_1 = model1.fit(
X_train_normalized, y_train_encoded,
epochs=30,
class_weight=class_weights, #using the class_weights generated to handle imbalanced data
validation_data=(X_val_normalized,y_val_encoded),
batch_size=64,
verbose=2
)
Epoch 1/30 54/54 - 2s - loss: 2.7327 - accuracy: 0.1778 - val_loss: 2.2195 - val_accuracy: 0.3263 - 2s/epoch - 42ms/step
Epoch 2/30 54/54 - 0s - loss: 2.0088 - accuracy: 0.3766 - val_loss: 1.6488 - val_accuracy: 0.4342 - 499ms/epoch - 9ms/step
Epoch 3/30 54/54 - 1s - loss: 1.4478 - accuracy: 0.5322 - val_loss: 1.2788 - val_accuracy: 0.5947 - 504ms/epoch - 9ms/step
Epoch 4/30 54/54 - 0s - loss: 1.1863 - accuracy: 0.6035 - val_loss: 1.2744 - val_accuracy: 0.5789 - 498ms/epoch - 9ms/step
Epoch 5/30 54/54 - 0s - loss: 1.0481 - accuracy: 0.6532 - val_loss: 1.1389 - val_accuracy: 0.6237 - 496ms/epoch - 9ms/step
Epoch 6/30 54/54 - 0s - loss: 0.9083 - accuracy: 0.7073 - val_loss: 0.9883 - val_accuracy: 0.6868 - 494ms/epoch - 9ms/step
Epoch 7/30 54/54 - 0s - loss: 0.7908 - accuracy: 0.7450 - val_loss: 0.9785 - val_accuracy: 0.6895 - 497ms/epoch - 9ms/step
Epoch 8/30 54/54 - 0s - loss: 0.6714 - accuracy: 0.7863 - val_loss: 0.9617 - val_accuracy: 0.6763 - 500ms/epoch - 9ms/step
Epoch 9/30 54/54 - 1s - loss: 0.5965 - accuracy: 0.8050 - val_loss: 0.9474 - val_accuracy: 0.7026 - 504ms/epoch - 9ms/step
Epoch 10/30 54/54 - 1s - loss: 0.5646 - accuracy: 0.8167 - val_loss: 0.9743 - val_accuracy: 0.7026 - 510ms/epoch - 9ms/step
Epoch 11/30 54/54 - 1s - loss: 0.4521 - accuracy: 0.8576 - val_loss: 0.9302 - val_accuracy: 0.7105 - 503ms/epoch - 9ms/step
Epoch 12/30 54/54 - 1s - loss: 0.3508 - accuracy: 0.8895 - val_loss: 0.9699 - val_accuracy: 0.7184 - 507ms/epoch - 9ms/step
Epoch 13/30 54/54 - 1s - loss: 0.2991 - accuracy: 0.9070 - val_loss: 1.1112 - val_accuracy: 0.6658 - 506ms/epoch - 9ms/step
Epoch 14/30 54/54 - 1s - loss: 0.2619 - accuracy: 0.9211 - val_loss: 1.0414 - val_accuracy: 0.7105 - 506ms/epoch - 9ms/step
Epoch 15/30 54/54 - 1s - loss: 0.2478 - accuracy: 0.9234 - val_loss: 1.3835 - val_accuracy: 0.6368 - 505ms/epoch - 9ms/step
Epoch 16/30 54/54 - 0s - loss: 0.2130 - accuracy: 0.9322 - val_loss: 1.0987 - val_accuracy: 0.7105 - 496ms/epoch - 9ms/step
Epoch 17/30 54/54 - 1s - loss: 0.1331 - accuracy: 0.9635 - val_loss: 1.1415 - val_accuracy: 0.7132 - 501ms/epoch - 9ms/step
Epoch 18/30 54/54 - 0s - loss: 0.1041 - accuracy: 0.9722 - val_loss: 1.1645 - val_accuracy: 0.7211 - 498ms/epoch - 9ms/step
Epoch 19/30 54/54 - 0s - loss: 0.0889 - accuracy: 0.9804 - val_loss: 1.1964 - val_accuracy: 0.7158 - 496ms/epoch - 9ms/step
Epoch 20/30 54/54 - 1s - loss: 0.0764 - accuracy: 0.9801 - val_loss: 1.2911 - val_accuracy: 0.7079 - 521ms/epoch - 10ms/step
Epoch 21/30 54/54 - 1s - loss: 0.0570 - accuracy: 0.9877 - val_loss: 1.3391 - val_accuracy: 0.7263 - 518ms/epoch - 10ms/step
Epoch 22/30 54/54 - 1s - loss: 0.0350 - accuracy: 0.9953 - val_loss: 1.3605 - val_accuracy: 0.7184 - 525ms/epoch - 10ms/step
Epoch 23/30 54/54 - 1s - loss: 0.0267 - accuracy: 0.9968 - val_loss: 1.3742 - val_accuracy: 0.7237 - 519ms/epoch - 10ms/step
Epoch 24/30 54/54 - 1s - loss: 0.0202 - accuracy: 0.9988 - val_loss: 1.4313 - val_accuracy: 0.7263 - 529ms/epoch - 10ms/step
Epoch 25/30 54/54 - 1s - loss: 0.0163 - accuracy: 0.9988 - val_loss: 1.3822 - val_accuracy: 0.7237 - 514ms/epoch - 10ms/step
Epoch 26/30 54/54 - 1s - loss: 0.0129 - accuracy: 0.9994 - val_loss: 1.4515 - val_accuracy: 0.7158 - 507ms/epoch - 9ms/step
Epoch 27/30 54/54 - 0s - loss: 0.0107 - accuracy: 0.9994 - val_loss: 1.5078 - val_accuracy: 0.7237 - 500ms/epoch - 9ms/step
Epoch 28/30 54/54 - 0s - loss: 0.0084 - accuracy: 0.9994 - val_loss: 1.5072 - val_accuracy: 0.7342 - 497ms/epoch - 9ms/step
Epoch 29/30 54/54 - 0s - loss: 0.0074 - accuracy: 0.9997 - val_loss: 1.5223 - val_accuracy: 0.7289 - 497ms/epoch - 9ms/step
Epoch 30/30 54/54 - 0s - loss: 0.0061 - accuracy: 1.0000 - val_loss: 1.5771 - val_accuracy: 0.7342 - 500ms/epoch - 9ms/step
The model seems to be overfitting - with ~100% accuracy in the training data set vs ~73% accuracy in the validation dataset.
plt.plot(history_1.history['accuracy'])
plt.plot(history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Observations:
We can see from the above plot that the training accuracy keeps improving toward 100% while the validation accuracy plateaus around 73%.
The model seems to overfit on the data.
accuracy = model1.evaluate(X_test_normalized, y_test_encoded, verbose=2)
30/30 - 0s - loss: 1.7497 - accuracy: 0.6811 - 98ms/epoch - 3ms/step
Model 1 gives ~68% accuracy on the test/unseen data.
# Here we get the output as probabilities for each category
y_pred1=model1.predict(X_test_normalized)
30/30 [==============================] - 0s 2ms/step
y_pred1
array([[2.2987963e-26, 2.8384855e-13, 5.5256862e-11, ..., 2.0011310e-06,
9.9993277e-01, 5.3156980e-13],
[3.7000701e-02, 1.0624351e-11, 1.2506959e-09, ..., 3.8056331e-15,
3.1669536e-08, 7.5564143e-04],
[3.9459584e-26, 2.3855588e-05, 3.3100338e-07, ..., 5.1825293e-05,
9.9992144e-01, 9.1526939e-11],
...,
[7.9226840e-16, 2.9878849e-12, 1.5901062e-14, ..., 1.5506890e-04,
9.4369838e-11, 4.2638578e-07],
[2.0926935e-10, 1.0312399e-04, 9.9885273e-01, ..., 6.2650279e-04,
1.5650109e-07, 2.5037190e-04],
[1.6908931e-08, 5.5975024e-02, 9.2789501e-01, ..., 1.4270269e-03,
6.5461843e-04, 1.2929792e-02]], dtype=float32)
Plotting the Confusion Matrix
TensorFlow also provides a built-in confusion_matrix function, so we can use it to evaluate the classification model.
The confusion matrix expects class labels as input. However, y_test_encoded contains one-hot vectors and y_pred contains probabilities, so we must first recover the class indices from them.
We will use the argmax() function, which returns the index of the maximum value, on both y_test_encoded and y_pred to obtain their respective classes.
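A toy example of what argmax does here (a minimal sketch):
# argmax maps both a one-hot label and a probability vector to a class index
print(np.argmax([0, 0, 1, 0]))          # 2, from a one-hot encoded label
print(np.argmax([0.1, 0.2, 0.6, 0.1]))  # 2, from predicted probabilities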
# Obtaining the categorical values from y_test_encoded and y_pred1
y_pred1_arg=np.argmax(y_pred1,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)
# Computing the confusion matrix using tf.math.confusion_matrix(), which is predefined in the TensorFlow module
cm = tf.math.confusion_matrix(y_test_arg, y_pred1_arg)  # named cm to avoid shadowing sklearn's confusion_matrix
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
plt.show()
Observations
We observe that some of the classes are not predicted correctly.
In comparison to the rest, classes 10 and 2 are well classified.
Classes 0, 4, 5, and 6 have comparatively more misclassifications; class 0 is frequently misclassified as class 6 and vice versa.
# Print the metrics
print(classification_report(y_test_arg,y_pred1_arg))
precision recall f1-score support
0 0.30 0.26 0.28 53
1 0.77 0.76 0.76 78
2 0.81 0.72 0.76 58
3 0.87 0.83 0.85 122
4 0.45 0.45 0.45 44
5 0.70 0.57 0.63 95
6 0.64 0.72 0.68 131
7 0.74 0.64 0.68 44
8 0.69 0.76 0.72 103
9 0.65 0.48 0.55 46
10 0.74 0.87 0.80 99
11 0.56 0.64 0.59 77
accuracy 0.68 950
macro avg 0.66 0.64 0.65 950
weighted avg 0.68 0.68 0.68 950
Model 1's precision/recall/f1 scores are low for class 0 as well as class 4.
As we can see, our initial model appears to overfit. Therefore, we'll try to address this problem with data augmentation, Batch Normalization, and Dropout. Let us also add more dense layers and see if that helps with the predictions.
In most real-world case studies, it is challenging to acquire a large number of images to train CNNs on. One approach to overcome this problem is Data Augmentation. CNNs have the property of translational invariance, which means they can recognise an object even if its appearance shifts translationally in some way. Taking this attribute into account, we can augment the images with random rotations, zooms, shifts, and flips, as configured in the ImageDataGenerator below.
Note that we should not apply data augmentation to the validation/test datasets.
# Creating new data via random transformations of the training images
train_datagen = ImageDataGenerator(shear_range=0.2,        # randomly shear images
                                   rotation_range=180,     # randomly rotate images in the range (degrees)
                                   zoom_range=0.1,         # randomly zoom images
                                   width_shift_range=0.1,  # randomly shift images horizontally
                                   height_shift_range=0.1, # randomly shift images vertically
                                   horizontal_flip=True,   # randomly flip images horizontally
                                   vertical_flip=True      # randomly flip images vertically
                                   )
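Before training on the augmented stream, we can draw one batch and display a few images to verify the transformations look reasonable (a sanity-check sketch, not part of the modeling pipeline):
# Preview a handful of randomly augmented training images
aug_iter = train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=8, seed=1)
aug_images, _ = next(aug_iter)  # one augmented batch
fig, axes = plt.subplots(1, 4, figsize=(8, 2))
for ax, img in zip(axes, aug_images[:4]):
    ax.imshow(img)  # pixel values are already scaled to [0, 1]
    ax.axis('off')
plt.show()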
# Clearing backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
# Intializing a sequential model
model2 = Sequential()
# Adding first conv layer with 64 filters and kernel size 3x3; padding 'same' keeps the output size the same as the input size
# input_shape denotes the dimensions of the input images
model2.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
# Adding max pooling to reduce the size of output of first conv layer
model2.add(MaxPooling2D((2, 2), padding = 'same'))
model2.add(BatchNormalization())
model2.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model2.add(MaxPooling2D((2, 2), padding = 'same'))
model2.add(BatchNormalization())
# flattening the output of the conv layer after max pooling to make it ready for creating dense connections
model2.add(Flatten())
# Adding a fully connected dense layer with 128 neurons
model2.add(Dense(128, activation='relu'))
# Introducing Dropout to address overfitting
model2.add(Dropout(0.3))
model2.add(Dense(64, activation='relu'))
model2.add(Dropout(0.3))
model2.add(Dense(32, activation='relu'))
# Adding the output layer with 12 neurons and softmax activation since this is a multi-class classification problem
model2.add(Dense(12, activation='softmax'))
# Compile model
model2.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
# Generating the summary of the model
model2.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 64) 1792
max_pooling2d (MaxPooling2D) (None, 32, 32, 64) 0
batch_normalization (BatchNormalization) (None, 32, 32, 64) 256
conv2d_1 (Conv2D) (None, 32, 32, 32) 18464
max_pooling2d_1 (MaxPooling2D) (None, 16, 16, 32) 0
batch_normalization_1 (BatchNormalization) (None, 16, 16, 32) 128
flatten (Flatten) (None, 8192) 0
dense (Dense) (None, 128) 1048704
dropout (Dropout) (None, 128) 0
dense_1 (Dense) (None, 64) 8256
dropout_1 (Dropout) (None, 64) 0
dense_2 (Dense) (None, 32) 2080
dense_3 (Dense) (None, 12) 396
=================================================================
Total params: 1,080,076
Trainable params: 1,079,884
Non-trainable params: 192
_________________________________________________________________
history2 = model2.fit(train_datagen.flow(X_train_normalized,y_train_encoded,
batch_size=64,
seed=1,
shuffle=False),
epochs=30,
steps_per_epoch=X_train_normalized.shape[0] // 64,
class_weight = class_weights,
validation_data=(X_val_normalized,y_val_encoded),
verbose=1)
Epoch 1/30 53/53 [==============================] - 7s 77ms/step - loss: 2.7410 - accuracy: 0.2339 - val_loss: 2.4311 - val_accuracy: 0.1368
Epoch 2/30 53/53 [==============================] - 4s 76ms/step - loss: 2.1642 - accuracy: 0.3504 - val_loss: 2.4676 - val_accuracy: 0.1368
Epoch 3/30 53/53 [==============================] - 4s 78ms/step - loss: 1.8802 - accuracy: 0.4178 - val_loss: 2.5501 - val_accuracy: 0.1368
Epoch 4/30 53/53 [==============================] - 4s 75ms/step - loss: 1.7948 - accuracy: 0.4446 - val_loss: 3.2774 - val_accuracy: 0.1368
Epoch 5/30 53/53 [==============================] - 4s 77ms/step - loss: 1.6637 - accuracy: 0.4794 - val_loss: 3.2284 - val_accuracy: 0.1368
Epoch 6/30 53/53 [==============================] - 4s 78ms/step - loss: 1.5972 - accuracy: 0.4931 - val_loss: 4.0374 - val_accuracy: 0.1368
Epoch 7/30 53/53 [==============================] - 4s 74ms/step - loss: 1.5506 - accuracy: 0.5180 - val_loss: 2.9823 - val_accuracy: 0.1526
Epoch 8/30 53/53 [==============================] - 4s 75ms/step - loss: 1.4231 - accuracy: 0.5498 - val_loss: 3.7098 - val_accuracy: 0.1421
Epoch 9/30 53/53 [==============================] - 4s 80ms/step - loss: 1.3608 - accuracy: 0.5748 - val_loss: 3.7460 - val_accuracy: 0.1974
Epoch 10/30 53/53 [==============================] - 4s 74ms/step - loss: 1.3198 - accuracy: 0.5796 - val_loss: 3.9207 - val_accuracy: 0.2132
Epoch 11/30 53/53 [==============================] - 4s 75ms/step - loss: 1.2809 - accuracy: 0.6055 - val_loss: 2.7159 - val_accuracy: 0.2921
Epoch 12/30 53/53 [==============================] - 4s 82ms/step - loss: 1.2032 - accuracy: 0.6186 - val_loss: 1.4840 - val_accuracy: 0.5289
Epoch 13/30 53/53 [==============================] - 4s 75ms/step - loss: 1.1896 - accuracy: 0.6186 - val_loss: 2.3896 - val_accuracy: 0.4000
Epoch 14/30 53/53 [==============================] - 4s 75ms/step - loss: 1.1550 - accuracy: 0.6403 - val_loss: 1.8761 - val_accuracy: 0.5000
Epoch 15/30 53/53 [==============================] - 4s 81ms/step - loss: 1.1325 - accuracy: 0.6511 - val_loss: 1.3902 - val_accuracy: 0.5632
Epoch 16/30 53/53 [==============================] - 4s 74ms/step - loss: 1.1156 - accuracy: 0.6487 - val_loss: 2.4278 - val_accuracy: 0.4053
Epoch 17/30 53/53 [==============================] - 4s 74ms/step - loss: 1.0232 - accuracy: 0.6850 - val_loss: 1.5580 - val_accuracy: 0.5737
Epoch 18/30 53/53 [==============================] - 4s 79ms/step - loss: 0.9768 - accuracy: 0.6865 - val_loss: 1.2315 - val_accuracy: 0.5526
Epoch 19/30 53/53 [==============================] - 4s 75ms/step - loss: 0.9595 - accuracy: 0.6880 - val_loss: 1.7043 - val_accuracy: 0.5289
Epoch 20/30 53/53 [==============================] - 4s 77ms/step - loss: 0.9157 - accuracy: 0.7131 - val_loss: 1.1456 - val_accuracy: 0.5974
Epoch 21/30 53/53 [==============================] - 4s 81ms/step - loss: 0.9089 - accuracy: 0.7202 - val_loss: 0.9776 - val_accuracy: 0.6816
Epoch 22/30 53/53 [==============================] - 4s 75ms/step - loss: 0.8641 - accuracy: 0.7271 - val_loss: 1.4811 - val_accuracy: 0.4895
Epoch 23/30 53/53 [==============================] - 4s 74ms/step - loss: 0.8696 - accuracy: 0.7285 - val_loss: 1.9161 - val_accuracy: 0.4868
Epoch 24/30 53/53 [==============================] - 4s 78ms/step - loss: 0.8649 - accuracy: 0.7303 - val_loss: 1.0485 - val_accuracy: 0.6895
Epoch 25/30 53/53 [==============================] - 4s 75ms/step - loss: 0.8335 - accuracy: 0.7431 - val_loss: 0.6157 - val_accuracy: 0.7789
Epoch 26/30 53/53 [==============================] - 4s 75ms/step - loss: 0.8033 - accuracy: 0.7536 - val_loss: 1.0116 - val_accuracy: 0.6842
Epoch 27/30 53/53 [==============================] - 4s 80ms/step - loss: 0.8135 - accuracy: 0.7461 - val_loss: 2.7124 - val_accuracy: 0.3974
Epoch 28/30 53/53 [==============================] - 4s 79ms/step - loss: 0.7919 - accuracy: 0.7515 - val_loss: 1.1421 - val_accuracy: 0.6132
Epoch 29/30 53/53 [==============================] - 4s 76ms/step - loss: 0.7746 - accuracy: 0.7557 - val_loss: 0.7441 - val_accuracy: 0.7289
Epoch 30/30 53/53 [==============================] - 4s 82ms/step - loss: 0.7374 - accuracy: 0.7670 - val_loss: 1.5479 - val_accuracy: 0.6053
plt.plot(history2.history['accuracy'])
plt.plot(history2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
There is quite a bit of fluctuation in the validation accuracy curve.
Data augmentation, Dropout, and BatchNormalization, introduced to reduce overfitting, do not seem to have made much difference: the model still overfits, with ~77% accuracy on the training data versus ~61% on the validation data.
accuracy = model2.evaluate(X_test_normalized, y_test_encoded, verbose=2)
30/30 - 0s - loss: 1.5547 - accuracy: 0.6337 - 131ms/epoch - 4ms/step
Model 2 gives ~63% accuracy on the test/unseen data.
# Here we get the output as probabilities for each category
y_pred2=model2.predict(X_test_normalized)
30/30 [==============================] - 0s 2ms/step
# Obtaining the categorical values from y_pred2
y_pred2_arg=np.argmax(y_pred2,axis=1)
# Computing the confusion matrix using tf.math.confusion_matrix(), which is predefined in the TensorFlow module
cm = tf.math.confusion_matrix(y_test_arg, y_pred2_arg)  # named cm to avoid shadowing sklearn's confusion_matrix
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
plt.show()
Observation:
We observe that quite a few classes are not predicted correctly. Barely any seedlings from class 0 were classified correctly: 34 out of its 53 samples were misclassified as class 6.
In comparison to the rest, classes 2, 7, and 8 are well classified.
Classes 0, 3, 4, 5, 9, 10, and 11 have comparatively more misclassifications.
# Print the metrics
print(classification_report(y_test_arg,y_pred2_arg))
precision recall f1-score support
0 0.67 0.04 0.07 53
1 0.93 0.50 0.65 78
2 0.46 0.90 0.61 58
3 0.97 0.77 0.86 122
4 0.58 0.48 0.53 44
5 0.86 0.26 0.40 95
6 0.66 0.85 0.74 131
7 1.00 0.80 0.89 44
8 0.38 0.98 0.55 103
9 0.46 0.52 0.49 46
10 1.00 0.54 0.70 99
11 0.80 0.58 0.68 77
accuracy 0.63 950
macro avg 0.73 0.60 0.60 950
weighted avg 0.75 0.63 0.62 950
Class 0 has the lowest recall/f1-score among all 12 classes.
Let's try again, but this time using the idea of Transfer Learning. We will load a pre-built architecture, VGG16, which was trained on the ImageNet dataset and was the runner-up in the 2014 ImageNet (ILSVRC) competition.
We will reuse VGG16's convolutional and pooling layers directly and freeze their weights, i.e., no training will be done on them. For classification, we will replace the existing fully connected layers with FC layers created specifically for our problem.
Let us also add more dense layers with more neurons.
Building Model3
# Clearing backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
vgg_model = VGG16(weights='imagenet', include_top = False, input_shape = (64,64,3))
vgg_model.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 64, 64, 3)] 0
block1_conv1 (Conv2D) (None, 64, 64, 64) 1792
block1_conv2 (Conv2D) (None, 64, 64, 64) 36928
block1_pool (MaxPooling2D) (None, 32, 32, 64) 0
block2_conv1 (Conv2D) (None, 32, 32, 128) 73856
block2_conv2 (Conv2D) (None, 32, 32, 128) 147584
block2_pool (MaxPooling2D) (None, 16, 16, 128) 0
block3_conv1 (Conv2D) (None, 16, 16, 256) 295168
block3_conv2 (Conv2D) (None, 16, 16, 256) 590080
block3_conv3 (Conv2D) (None, 16, 16, 256) 590080
block3_pool (MaxPooling2D) (None, 8, 8, 256) 0
block4_conv1 (Conv2D) (None, 8, 8, 512) 1180160
block4_conv2 (Conv2D) (None, 8, 8, 512) 2359808
block4_conv3 (Conv2D) (None, 8, 8, 512) 2359808
block4_pool (MaxPooling2D) (None, 4, 4, 512) 0
block5_conv1 (Conv2D) (None, 4, 4, 512) 2359808
block5_conv2 (Conv2D) (None, 4, 4, 512) 2359808
block5_conv3 (Conv2D) (None, 4, 4, 512) 2359808
block5_pool (MaxPooling2D) (None, 2, 2, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
# Making all the layers of the VGG model non-trainable. i.e. freezing them
for layer in vgg_model.layers:
layer.trainable = False
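A quick sanity check (sketch): after freezing, the VGG16 base should expose no trainable weights.
# The frozen VGG16 base contributes no trainable weights
print(len(vgg_model.trainable_weights))      # expected: 0
print(len(vgg_model.non_trainable_weights))  # 26 tensors: 13 conv layers x (kernel, bias)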
model3 = Sequential()
# Adding the convolutional part of the VGG16 model from above
model3.add(vgg_model)
# Flattening the output of the VGG16 model because it is from a convolutional layer
model3.add(Flatten())
# Adding fully connected layers for classification
model3.add(Dense(1024, activation='relu'))
model3.add(BatchNormalization())
model3.add(Dense(512, activation='relu'))
model3.add(BatchNormalization())
model3.add(Dense(256, activation='relu'))
model3.add(Dropout(0.3))
model3.add(Dense(128, activation='relu'))
model3.add(Dropout(0.2))
model3.add(Dense(64, activation='relu'))
model3.add(Dense(32, activation='relu'))
model3.add(Dense(12, activation='softmax'))
# Compile model
model3.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
# Generating the summary of the model
model3.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 2, 2, 512) 14714688
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 1024) 2098176
batch_normalization (BatchNormalization) (None, 1024) 4096
dense_1 (Dense) (None, 512) 524800
batch_normalization_1 (BatchNormalization) (None, 512) 2048
dense_2 (Dense) (None, 256) 131328
dropout (Dropout) (None, 256) 0
dense_3 (Dense) (None, 128) 32896
dropout_1 (Dropout) (None, 128) 0
dense_4 (Dense) (None, 64) 8256
dense_5 (Dense) (None, 32) 2080
dense_6 (Dense) (None, 12) 396
=================================================================
Total params: 17,518,764
Trainable params: 2,801,004
Non-trainable params: 14,717,760
_________________________________________________________________
history_vgg16 = model3.fit(train_datagen.flow(X_train_normalized,y_train_encoded, #Using the augmented data
batch_size=64,
seed=1,
shuffle=False),
epochs=30,
steps_per_epoch=X_train_normalized.shape[0] // 64,
class_weight = class_weights,
validation_data=(X_val_normalized,y_val_encoded),
verbose=1)
Epoch 1/30 53/53 [==============================] - 8s 82ms/step - loss: 2.4217 - accuracy: 0.2989 - val_loss: 2.1155 - val_accuracy: 0.2579
Epoch 2/30 53/53 [==============================] - 5s 88ms/step - loss: 1.9090 - accuracy: 0.3865 - val_loss: 1.7171 - val_accuracy: 0.3816
Epoch 3/30 53/53 [==============================] - 4s 81ms/step - loss: 1.7249 - accuracy: 0.4482 - val_loss: 1.7644 - val_accuracy: 0.4000
Epoch 4/30 53/53 [==============================] - 4s 83ms/step - loss: 1.6724 - accuracy: 0.4636 - val_loss: 1.6037 - val_accuracy: 0.4079
Epoch 5/30 53/53 [==============================] - 5s 84ms/step - loss: 1.5974 - accuracy: 0.4741 - val_loss: 1.4688 - val_accuracy: 0.4921
Epoch 6/30 53/53 [==============================] - 4s 79ms/step - loss: 1.5385 - accuracy: 0.4946 - val_loss: 1.4840 - val_accuracy: 0.5132
Epoch 7/30 53/53 [==============================] - 4s 83ms/step - loss: 1.4641 - accuracy: 0.5304 - val_loss: 1.3137 - val_accuracy: 0.5368
Epoch 8/30 53/53 [==============================] - 4s 81ms/step - loss: 1.4674 - accuracy: 0.5265 - val_loss: 1.3541 - val_accuracy: 0.5500
Epoch 9/30 53/53 [==============================] - 4s 84ms/step - loss: 1.4491 - accuracy: 0.5384 - val_loss: 1.3170 - val_accuracy: 0.5237
Epoch 10/30 53/53 [==============================] - 4s 84ms/step - loss: 1.3901 - accuracy: 0.5602 - val_loss: 1.3010 - val_accuracy: 0.5474
Epoch 11/30 53/53 [==============================] - 4s 81ms/step - loss: 1.4045 - accuracy: 0.5513 - val_loss: 1.2746 - val_accuracy: 0.5447
Epoch 12/30 53/53 [==============================] - 4s 79ms/step - loss: 1.3363 - accuracy: 0.5700 - val_loss: 1.3079 - val_accuracy: 0.5447
Epoch 13/30 53/53 [==============================] - 5s 85ms/step - loss: 1.3273 - accuracy: 0.5694 - val_loss: 1.2260 - val_accuracy: 0.5684
Epoch 14/30 53/53 [==============================] - 4s 80ms/step - loss: 1.3068 - accuracy: 0.5724 - val_loss: 1.2754 - val_accuracy: 0.5211
Epoch 15/30 53/53 [==============================] - 4s 80ms/step - loss: 1.2689 - accuracy: 0.5915 - val_loss: 1.1751 - val_accuracy: 0.5816
Epoch 16/30 53/53 [==============================] - 4s 85ms/step - loss: 1.2934 - accuracy: 0.5805 - val_loss: 1.1743 - val_accuracy: 0.5816
Epoch 17/30 53/53 [==============================] - 4s 78ms/step - loss: 1.3241 - accuracy: 0.5816 - val_loss: 1.1136 - val_accuracy: 0.6026
Epoch 18/30 53/53 [==============================] - 4s 79ms/step - loss: 1.2164 - accuracy: 0.6129 - val_loss: 1.1652 - val_accuracy: 0.5947
Epoch 19/30 53/53 [==============================] - 4s 83ms/step - loss: 1.2231 - accuracy: 0.6019 - val_loss: 1.1871 - val_accuracy: 0.5763
Epoch 20/30 53/53 [==============================] - 4s 79ms/step - loss: 1.2217 - accuracy: 0.6073 - val_loss: 1.0540 - val_accuracy: 0.6105
Epoch 21/30 53/53 [==============================] - 4s 79ms/step - loss: 1.1854 - accuracy: 0.6103 - val_loss: 1.2280 - val_accuracy: 0.5763
Epoch 22/30 53/53 [==============================] - 4s 84ms/step - loss: 1.1707 - accuracy: 0.6177 - val_loss: 1.0867 - val_accuracy: 0.6053
Epoch 23/30 53/53 [==============================] - 4s 80ms/step - loss: 1.1782 - accuracy: 0.6111 - val_loss: 1.1552 - val_accuracy: 0.6105
Epoch 24/30 53/53 [==============================] - 4s 79ms/step - loss: 1.1761 - accuracy: 0.6162 - val_loss: 1.1574 - val_accuracy: 0.5842
Epoch 25/30 53/53 [==============================] - 4s 82ms/step - loss: 1.1548 - accuracy: 0.6213 - val_loss: 1.4548 - val_accuracy: 0.5132
Epoch 26/30 53/53 [==============================] - 4s 78ms/step - loss: 1.1699 - accuracy: 0.6323 - val_loss: 1.1550 - val_accuracy: 0.5947
Epoch 27/30 53/53 [==============================] - 4s 82ms/step - loss: 1.1396 - accuracy: 0.6335 - val_loss: 1.1482 - val_accuracy: 0.6105
Epoch 28/30 53/53 [==============================] - 4s 82ms/step - loss: 1.1216 - accuracy: 0.6293 - val_loss: 1.0338 - val_accuracy: 0.6237
Epoch 29/30 53/53 [==============================] - 4s 81ms/step - loss: 1.1303 - accuracy: 0.6326 - val_loss: 1.1416 - val_accuracy: 0.5737
Epoch 30/30 53/53 [==============================] - 4s 83ms/step - loss: 1.0821 - accuracy: 0.6532 - val_loss: 1.1372 - val_accuracy: 0.6026
plt.plot(history_vgg16.history['accuracy'])
plt.plot(history_vgg16.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
The model is not overfitting. In this respect, it certainly is an improvement over models 1 and 2.
Despite the use of transfer learning, model 3 did not yield the desired accuracy: roughly 65% on the training set and 60% on the validation set.
accuracy = model3.evaluate(X_test_normalized, y_test_encoded, verbose=2)
30/30 - 0s - loss: 1.1599 - accuracy: 0.5863 - 451ms/epoch - 15ms/step
Model 3 gives ~59% accuracy on the test/unseen data.
# Here we get the output as probabilities for each category
y_pred3=model3.predict(X_test_normalized)
30/30 [==============================] - 1s 14ms/step
# Obtaining the categorical values from y_pred3
y_pred3_arg=np.argmax(y_pred3,axis=1)
# Computing the confusion matrix using tf.math.confusion_matrix(), which is predefined in the TensorFlow module
cm = tf.math.confusion_matrix(y_test_arg, y_pred3_arg)  # named cm to avoid shadowing sklearn's confusion_matrix
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
plt.show()
Observation:
We observe that a few classes are not predicted correctly. Class 0 is mostly misclassified as class 6; not a single seedling in this category was correctly identified by the model.
Besides class 0, classes 1, 4, 5, 8, 9, 10, and 11 have high misclassification rates as well.
# Print the metrics
print(classification_report(y_test_arg,y_pred3_arg))
precision recall f1-score support
0 0.00 0.00 0.00 53
1 0.67 0.36 0.47 78
2 0.64 0.71 0.67 58
3 0.61 0.82 0.70 122
4 0.71 0.34 0.46 44
5 0.44 0.60 0.50 95
6 0.58 0.85 0.69 131
7 0.63 0.73 0.67 44
8 0.66 0.50 0.57 103
9 0.45 0.22 0.29 46
10 0.61 0.66 0.63 99
11 0.59 0.60 0.59 77
accuracy 0.59 950
macro avg 0.55 0.53 0.52 950
weighted avg 0.56 0.59 0.56 950
Class 0 has the lowest precision/recall/f1-scores of all the classes.
# Visualizing the predicted and actual labels of images from the test data
plt.figure(figsize=(2,2))
plt.imshow(X_test[2])
plt.show()
print('Predicted Label', encoder.inverse_transform(model3.predict((X_test_normalized[2].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded)[2]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test[33])
plt.show()
print('Predicted Label', encoder.inverse_transform(model3.predict((X_test_normalized[33].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded)[33]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test[36])
plt.show()
print('Predicted Label', encoder.inverse_transform(model3.predict((X_test_normalized[36].reshape(1,64,64,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded)[36])
1/1 [==============================] - 0s 19ms/step
Predicted Label ['Small-flowered Cranesbill']
True Label Small-flowered Cranesbill
1/1 [==============================] - 0s 18ms/step
Predicted Label ['Fat Hen']
True Label Common Chickweed
1/1 [==============================] - 0s 19ms/step
Predicted Label ['Loose Silky-bent']
True Label Loose Silky-bent
In spite of the low accuracy rates, 2 out of the 3 random samples were classified correctly.
We will reuse the configuration of model 3, but with the original 128x128 images (vs. the 64x64 resized images used in model 3).
# Clearing backend
backend.clear_session()
# Fixing the seed for random number generators
import random
np.random.seed(1)
random.seed(1)
tf.random.set_seed(1)
Using the original images to split and normalize the data.
X_temp, X_test_org, y_temp, y_test_org = train_test_split(np.array(images),labels , test_size=0.2, random_state=1,stratify=labels)
X_train_org, X_val_org, y_train_org, y_val_org = train_test_split(X_temp,y_temp , test_size=0.1, random_state=1,stratify=y_temp)
print(X_train_org.shape, y_train_org.shape)
print(X_val_org.shape, y_val_org.shape)
print(X_test_org.shape, y_test_org.shape)
# Convert labels from names to one-hot vectors.
# LabelBinarizer works similarly to OneHotEncoder
y_train_encoded_org = encoder.fit_transform(y_train_org)
y_val_encoded_org=encoder.transform(y_val_org)
y_test_encoded_org=encoder.transform(y_test_org)
# Normalizing the image pixels
X_train_normalized_org = X_train_org.astype('float32')/255.0
X_val_normalized_org = X_val_org.astype('float32')/255.0
X_test_normalized_org = X_test_org.astype('float32')/255.0
(3420, 128, 128, 3) (3420, 1)
(380, 128, 128, 3) (380, 1)
(950, 128, 128, 3) (950, 1)
vgg_model2 = VGG16(weights='imagenet', include_top = False, input_shape = (128,128,3)) # Using the original image size of 128x128
vgg_model2.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 128, 128, 3)] 0
block1_conv1 (Conv2D) (None, 128, 128, 64) 1792
block1_conv2 (Conv2D) (None, 128, 128, 64) 36928
block1_pool (MaxPooling2D) (None, 64, 64, 64) 0
block2_conv1 (Conv2D) (None, 64, 64, 128) 73856
block2_conv2 (Conv2D) (None, 64, 64, 128) 147584
block2_pool (MaxPooling2D) (None, 32, 32, 128) 0
block3_conv1 (Conv2D) (None, 32, 32, 256) 295168
block3_conv2 (Conv2D) (None, 32, 32, 256) 590080
block3_conv3 (Conv2D) (None, 32, 32, 256) 590080
block3_pool (MaxPooling2D) (None, 16, 16, 256) 0
block4_conv1 (Conv2D) (None, 16, 16, 512) 1180160
block4_conv2 (Conv2D) (None, 16, 16, 512) 2359808
block4_conv3 (Conv2D) (None, 16, 16, 512) 2359808
block4_pool (MaxPooling2D) (None, 8, 8, 512) 0
block5_conv1 (Conv2D) (None, 8, 8, 512) 2359808
block5_conv2 (Conv2D) (None, 8, 8, 512) 2359808
block5_conv3 (Conv2D) (None, 8, 8, 512) 2359808
block5_pool (MaxPooling2D) (None, 4, 4, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
# Making all the layers of the VGG model non-trainable. i.e. freezing them
for layer in vgg_model2.layers:
layer.trainable = False
model4 = Sequential()
# Adding the convolutional part of the VGG16 model from above
model4.add(vgg_model2)
# Flattening the output of the VGG16 model because it is from a convolutional layer
model4.add(Flatten())
# Adding fully connected layers for classification
model4.add(Dense(1024, activation='relu'))
model4.add(BatchNormalization())
model4.add(Dense(512, activation='relu'))
model4.add(BatchNormalization())
model4.add(Dense(256, activation='relu'))
model4.add(Dropout(0.3))
model4.add(Dense(128, activation='relu'))
model4.add(Dropout(0.2))
model4.add(Dense(64, activation='relu'))
model4.add(Dense(32, activation='relu'))
model4.add(Dense(12, activation='softmax'))
# Compile model
model4.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
# Generating the summary of the model
model4.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 4, 4, 512) 14714688
flatten (Flatten) (None, 8192) 0
dense (Dense) (None, 1024) 8389632
batch_normalization (BatchNormalization) (None, 1024) 4096
dense_1 (Dense) (None, 512) 524800
batch_normalization_1 (BatchNormalization) (None, 512) 2048
dense_2 (Dense) (None, 256) 131328
dropout (Dropout) (None, 256) 0
dense_3 (Dense) (None, 128) 32896
dropout_1 (Dropout) (None, 128) 0
dense_4 (Dense) (None, 64) 8256
dense_5 (Dense) (None, 32) 2080
dense_6 (Dense) (None, 12) 396
=================================================================
Total params: 23,810,220
Trainable params: 9,092,460
Non-trainable params: 14,717,760
_________________________________________________________________
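The jump in trainable parameters relative to model 3 comes almost entirely from the flatten/first-dense boundary: VGG16 halves the spatial size five times, so the larger input leaves a bigger feature map. A quick check of the arithmetic (a small sketch):
# Final feature-map side after five 2x2 poolings, for each input size
print(128 // 2**5, 64 // 2**5)   # 4 vs 2
print(4 * 4 * 512, 2 * 2 * 512)  # 8192 vs 2048 flattened features
print((8192 + 1) * 1024)         # first Dense layer: 8,389,632 params, as in the summary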
history_vgg16_2 = model4.fit(train_datagen.flow(X_train_normalized_org,y_train_encoded_org,
batch_size=64,
seed=1,
shuffle=False),
epochs=30,
steps_per_epoch=X_train_normalized_org.shape[0] // 64,
class_weight = class_weights,
validation_data=(X_val_normalized_org,y_val_encoded_org),
verbose=1)
Epoch 1/30 53/53 [==============================] - 16s 231ms/step - loss: 2.3622 - accuracy: 0.2880 - val_loss: 2.8096 - val_accuracy: 0.1474
Epoch 2/30 53/53 [==============================] - 13s 237ms/step - loss: 1.7866 - accuracy: 0.4240 - val_loss: 1.6868 - val_accuracy: 0.4263
Epoch 3/30 53/53 [==============================] - 13s 238ms/step - loss: 1.5508 - accuracy: 0.5018 - val_loss: 2.2782 - val_accuracy: 0.3711
Epoch 4/30 53/53 [==============================] - 12s 232ms/step - loss: 1.4315 - accuracy: 0.5396 - val_loss: 1.4398 - val_accuracy: 0.4974
Epoch 5/30 53/53 [==============================] - 13s 238ms/step - loss: 1.3193 - accuracy: 0.5766 - val_loss: 1.1945 - val_accuracy: 0.5895
Epoch 6/30 53/53 [==============================] - 13s 239ms/step - loss: 1.2897 - accuracy: 0.5945 - val_loss: 1.0653 - val_accuracy: 0.6289
Epoch 7/30 53/53 [==============================] - 13s 244ms/step - loss: 1.1990 - accuracy: 0.6278 - val_loss: 1.1182 - val_accuracy: 0.6132
Epoch 8/30 53/53 [==============================] - 13s 240ms/step - loss: 1.1515 - accuracy: 0.6389 - val_loss: 1.1267 - val_accuracy: 0.6368
Epoch 9/30 53/53 [==============================] - 12s 237ms/step - loss: 1.1470 - accuracy: 0.6421 - val_loss: 0.9233 - val_accuracy: 0.6605
Epoch 10/30 53/53 [==============================] - 13s 236ms/step - loss: 1.0694 - accuracy: 0.6636 - val_loss: 1.2980 - val_accuracy: 0.5553
Epoch 11/30 53/53 [==============================] - 12s 235ms/step - loss: 1.0402 - accuracy: 0.6654 - val_loss: 0.9684 - val_accuracy: 0.6632
Epoch 12/30 53/53 [==============================] - 13s 238ms/step - loss: 1.0589 - accuracy: 0.6615 - val_loss: 1.0189 - val_accuracy: 0.6447
Epoch 13/30 53/53 [==============================] - 13s 238ms/step - loss: 1.0138 - accuracy: 0.6707 - val_loss: 1.0787 - val_accuracy: 0.6342
Epoch 14/30 53/53 [==============================] - 13s 238ms/step - loss: 0.9747 - accuracy: 0.6952 - val_loss: 0.9009 - val_accuracy: 0.6921
Epoch 15/30 53/53 [==============================] - 13s 239ms/step - loss: 1.0062 - accuracy: 0.6880 - val_loss: 0.9131 - val_accuracy: 0.6789
Epoch 16/30 53/53 [==============================] - 12s 234ms/step - loss: 0.9519 - accuracy: 0.7017 - val_loss: 0.8832 - val_accuracy: 0.6842
Epoch 17/30 53/53 [==============================] - 13s 239ms/step - loss: 0.9666 - accuracy: 0.6999 - val_loss: 0.9719 - val_accuracy: 0.6737
Epoch 18/30 53/53 [==============================] - 12s 233ms/step - loss: 0.9153 - accuracy: 0.7014 - val_loss: 0.7826 - val_accuracy: 0.7211
Epoch 19/30 53/53 [==============================] - 12s 234ms/step - loss: 0.9008 - accuracy: 0.7145 - val_loss: 0.9260 - val_accuracy: 0.6579
Epoch 20/30 53/53 [==============================] - 13s 234ms/step - loss: 0.9081 - accuracy: 0.7017 - val_loss: 0.8554 - val_accuracy: 0.6895
Epoch 21/30 53/53 [==============================] - 13s 238ms/step - loss: 0.9350 - accuracy: 0.7041 - val_loss: 0.8462 - val_accuracy: 0.6816
Epoch 22/30 53/53 [==============================] - 12s 233ms/step - loss: 0.8162 - accuracy: 0.7348 - val_loss: 1.0182 - val_accuracy: 0.6658
Epoch 23/30 53/53 [==============================] - 12s 235ms/step - loss: 0.8595 - accuracy: 0.7184 - val_loss: 1.1200 - val_accuracy: 0.6158
Epoch 24/30 53/53 [==============================] - 12s 234ms/step - loss: 0.8521 - accuracy: 0.7274 - val_loss: 0.7737 - val_accuracy: 0.7368
Epoch 25/30 53/53 [==============================] - 13s 239ms/step - loss: 0.8370 - accuracy: 0.7211 - val_loss: 0.7516 - val_accuracy: 0.7632
Epoch 26/30 53/53 [==============================] - 12s 234ms/step - loss: 0.8130 - accuracy: 0.7348 - val_loss: 0.7314 - val_accuracy: 0.7368
Epoch 27/30 53/53 [==============================] - 13s 237ms/step - loss: 0.7873 - accuracy: 0.7402 - val_loss: 0.8943 - val_accuracy: 0.6789
Epoch 28/30 53/53 [==============================] - 13s 236ms/step - loss: 0.8205 - accuracy: 0.7360 - val_loss: 0.7719 - val_accuracy: 0.7395
Epoch 29/30 53/53 [==============================] - 13s 239ms/step - loss: 0.8078 - accuracy: 0.7333 - val_loss: 0.7455 - val_accuracy: 0.7579
Epoch 30/30 53/53 [==============================] - 13s 236ms/step - loss: 0.7713 - accuracy: 0.7548 - val_loss: 0.7156 - val_accuracy: 0.7553
plt.plot(history_vgg16_2.history['accuracy'])
plt.plot(history_vgg16_2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
Model 4 is not overfitting and produced good accuracy rates (~75% on both the training and validation datasets).
accuracy = model4.evaluate(X_test_normalized_org, y_test_encoded_org, verbose=2)
30/30 - 1s - loss: 0.7884 - accuracy: 0.7326 - 1s/epoch - 44ms/step
Model 4 gives ~73% accuracy on the test/unseen data - the best rate across all 4 models.
# Here we get the output as probabilities for each category
y_pred4=model4.predict(X_test_normalized_org)
30/30 [==============================] - 1s 44ms/step
# Obtaining the categorical values from y_pred4
y_pred4_arg=np.argmax(y_pred4,axis=1)
y_test_org_arg=np.argmax(y_test_encoded_org,axis=1)
# Computing the confusion matrix using tf.math.confusion_matrix(), which is predefined in the TensorFlow module
cm = tf.math.confusion_matrix(y_test_org_arg, y_pred4_arg)  # named cm to avoid shadowing sklearn's confusion_matrix
f, ax = plt.subplots(figsize=(10, 8))
sns.heatmap(
    cm,
annot=True,
linewidths=.4,
fmt="d",
square=True,
ax=ax
)
plt.show()
Observation:
We observe that most of the classes are predicted correctly.
Class 0 remains the class with the most misclassifications: the model correctly predicted only 11 out of its 53 samples. There are some misclassifications in classes 4, 8, and 11 as well. Besides these 4 classes, the model seems to have classified the rest reasonably well.
Overall, this seems to be our best model.
# Print the metrics
print(classification_report(y_test_org_arg,y_pred4_arg))
precision recall f1-score support
0 0.46 0.21 0.29 53
1 0.84 0.67 0.74 78
2 0.76 0.81 0.78 58
3 0.88 0.90 0.89 122
4 0.66 0.52 0.58 44
5 0.64 0.73 0.68 95
6 0.57 0.95 0.71 131
7 0.84 0.82 0.83 44
8 0.86 0.67 0.75 103
9 0.64 0.50 0.56 46
10 0.95 0.76 0.84 99
11 0.74 0.73 0.73 77
accuracy 0.73 950
macro avg 0.74 0.69 0.70 950
weighted avg 0.75 0.73 0.73 950
Leaving aside the metrics for class 0, the precision, recall, and f1-scores from model 4 are superior to those of the other 3 models.
# Visualizing the predicted and actual labels of images from the test data
plt.figure(figsize=(2,2))
plt.imshow(X_test_org[2])
plt.show()
print('Predicted Label', encoder.inverse_transform(model4.predict((X_test_normalized_org[2].reshape(1,128,128,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded_org)[2]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test_org[33])
plt.show()
print('Predicted Label', encoder.inverse_transform(model4.predict((X_test_normalized_org[33].reshape(1,128,128,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded_org)[33]) # using inverse_transform() to get the output label from the output vector
plt.figure(figsize=(2,2))
plt.imshow(X_test_org[36])
plt.show()
print('Predicted Label', encoder.inverse_transform(model4.predict((X_test_normalized_org[36].reshape(1,128,128,3))))) # reshaping the input image as we are only trying to predict using a single image
print('True Label', encoder.inverse_transform(y_test_encoded_org)[36])
1/1 [==============================] - 0s 21ms/step
Predicted Label ['Small-flowered Cranesbill']
True Label Small-flowered Cranesbill
1/1 [==============================] - 0s 19ms/step
Predicted Label ['Common Chickweed']
True Label Common Chickweed
1/1 [==============================] - 0s 20ms/step
Predicted Label ['Loose Silky-bent']
True Label Loose Silky-bent
All the 3 samples that we randomly picked have been predicted correctly.
pd.DataFrame({'Models':['Base CNN Model','CNN Model with Data Augmentation','Transfer Learning Model with resized images', 'Transfer Learning Model with original images'],'Train Accuracy':['100%','77%','65%','75%'],'Validation Accuracy':['73%','61%','60%', '75%'],'Test Accuracy':['68%','63%','59%', '73%']})
| | Models | Train Accuracy | Validation Accuracy | Test Accuracy |
|---|---|---|---|---|
| 0 | Base CNN Model | 100% | 73% | 68% |
| 1 | CNN Model with Data Augmentation | 77% | 61% | 63% |
| 2 | Transfer Learning Model with resized images | 65% | 60% | 59% |
| 3 | Transfer Learning Model with original images | 75% | 75% | 73% |
Conclusion
The dataset is imbalanced. Class weights were used to handle the data imbalance.
Data Augmentation was used to create more data samples.
Applying a Gaussian blur appeared to be counterproductive: the denoised images no longer seemed to contain the relevant detail, and the model would struggle to categorize such blurred images, so Gaussian blur was not employed.
Four models were built to predict the class of the seedling - 2 ordinary CNN models (model1 and model2) and 2 models built using pre-trained VGG16 (model3 and model4). Models 1, 2 and 3 were configured to use the resized images (64x64). Model 4 was configured to use the original images of size 128x128. Model4 predicted the majority of the classes better than the other 3 models.
The test accuracy of the VGG16 model built on the original images (model 4) is 73%. The very same configuration using resized images (model 3) yielded lower accuracy (59% on the test set).
Both the VGG16 models (models 3 & 4) solved the issue of overfitting that was quite prominent in vanilla CNN models (models 1 & 2).
All 4 models had difficulty correctly classifying Black-grass (class 0); the training data had only 189 Black-grass samples (~5.5%). The models tended to misclassify Black-grass as Loose Silky-bent (class 6), the majority class in the training set (471 samples, ~14%). A more balanced dataset with more Black-grass samples should help the prediction rates.
Simply employing transfer learning produced a better and more consistent outcome than an ordinary CNN.